319 research outputs found
A fine grained heuristic to capture web navigation patterns
In previous work we have proposed a statistical model to capture the user behaviour when browsing the web. The user navigation information obtained from web logs is modelled as a hypertext probabilistic grammar (HPG) which is within the class of regular probabilistic grammars. The set of highest probability strings generated by the grammar corresponds to the user preferred navigation trails. We have previously conducted experiments with a Breadth-First Search algorithm (BFS) to perform the exhaustive computation of all the strings with probability above a specified cut-point, which we call the rules. Although the algorithm's running time varies linearly with the number of grammar states, it has the drawbacks of returning a large number of rules when the cut-point is small and a small set of very short rules when the cut-point is high.
In this work, we present a new heuristic that implements an iterative deepening search wherein the set of rules is incrementally augmented by first exploring trails with high probability. A stopping parameter is provided which measures the distance between the current rule-set and its corresponding maximal set obtained by the BFS algorithm. When the stopping parameter takes the value zero the heuristic corresponds to the BFS algorithm, and as the parameter takes values closer to one the number of rules obtained decreases accordingly.
Experiments were conducted with both real and synthetic data, and the results show that for a given cut-point the number of rules induced increases smoothly as the stopping criterion decreases. Therefore, by setting the value of the stopping criterion the analyst can determine the number and quality of rules to be induced; the quality of a rule is measured by both its length and probability.
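As a rough illustration of the exhaustive BFS computation mentioned in this abstract, the sketch below enumerates trails whose probability stays at or above a cut-point. It is not the authors' implementation: it assumes a simplified first-order Markov chain in place of the full HPG, keeps only maximal trails as rules, and the names `bfs_rules`, `start_probs`, `transitions`, and `max_len` are hypothetical.

```python
from collections import deque


def bfs_rules(start_probs, transitions, cut_point, max_len=20):
    """Enumerate maximal trails with probability >= cut_point.

    Assumptions (not from the source): start_probs maps a page to its
    probability of starting a trail, transitions maps a page to a dict of
    {next_page: probability}, and max_len is a safety bound so that cycles
    with probability 1 cannot make the search run forever.
    """
    rules = []
    # Seed the queue with every single-page trail above the cut-point.
    queue = deque(([s], p) for s, p in start_probs.items() if p >= cut_point)
    while queue:
        trail, prob = queue.popleft()
        extended = False
        if len(trail) < max_len:
            for nxt, p in transitions.get(trail[-1], {}).items():
                new_prob = prob * p
                if new_prob >= cut_point:
                    queue.append((trail + [nxt], new_prob))
                    extended = True
        # A trail that cannot be extended above the cut-point becomes a rule.
        if not extended:
            rules.append((tuple(trail), prob))
    return rules


if __name__ == "__main__":
    start = {"A": 0.5, "B": 0.5}
    trans = {"A": {"B": 0.6, "C": 0.4}, "B": {"C": 1.0}, "C": {}}
    for trail, prob in bfs_rules(start, trans, cut_point=0.25):
        print(trail, round(prob, 3))
```

Lowering `cut_point` in this toy setting yields more and longer trails, which mirrors the trade-off the abstract describes for the BFS algorithm.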
Real bad grammar: realistic grammatical description with grammaticality
Sampson (this issue) argues for a concept of "realistic grammatical description" in which the distinction between grammatical and ungrammatical sentences is irrelevant. In this article I also argue for a concept of "realistic grammatical description" but one in which a binary distinction between grammatical and ungrammatical sentences is maintained. In distinguishing between the grammatical and ungrammatical, this kind of grammar differs from that proposed by Sampson, but it does share the important property that invented sentences have no role to play, either as positive or negative evidence.
Generating dynamic higher-order Markov models in web usage mining
Markov models have been widely used for modelling users' web navigation behaviour. In previous work we have presented a dynamic clustering-based Markov model that accurately represents second-order transition probabilities given by a collection of navigation sessions. Herein, we propose a generalisation of the method that takes into account higher-order conditional probabilities. The method makes use of the state cloning concept together with a clustering technique to separate the navigation paths that reveal differences in the conditional probabilities. We report on experiments conducted with three real-world data sets. The results show that some pages require a long history to understand the user's choice of link, while others require only a short history. We also show that the number of additional states induced by the method can be controlled through a probability threshold parameter.
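To make the idea of higher-order conditional probabilities concrete, the sketch below estimates them from navigation sessions by plain counting. This is not the state-cloning and clustering method of the abstract, only the quantity that method is said to represent; the function name `conditional_probs` and the toy sessions are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_probs(sessions, order):
    """Estimate P(next page | previous `order` pages) by counting.

    sessions is a list of page sequences; the result maps each observed
    history (a tuple of `order` pages) to {next_page: probability}.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(order, len(session)):
            history = tuple(session[i - order:i])
            counts[history][session[i]] += 1
    return {
        history: {page: n / sum(c.values()) for page, n in c.items()}
        for history, c in counts.items()
    }


if __name__ == "__main__":
    toy_sessions = [
        ["A", "B", "C"],
        ["A", "B", "D"],
        ["X", "B", "C"],
        ["X", "B", "C"],
    ]
    print(conditional_probs(toy_sessions, order=1)[("B",)])
    print(conditional_probs(toy_sessions, order=2))
```

In the toy data, the first-order estimate for page B mixes two different behaviours, while the second-order estimate separates users arriving from A from those arriving from X. That is the kind of difference the abstract says can justify a longer history for some pages.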
From treebank resources to LFG F-structures
We present two methods for automatically annotating treebank resources with functional structures. Both methods define systematic patterns of correspondence between partial PS configurations and functional structures. These are applied to PS rules extracted from treebanks, or directly to constraint set encodings of treebank PS trees.
Fuzzy Intervals for Designing Structural Signature: An Application to Graphic Symbol Recognition
Revised selected papers from the Eighth IAPR International Workshop on Graphics RECognition (GREC) 2009. The motivation behind our work is to present a new methodology for symbol recognition. The proposed method employs a structural approach for representing visual associations in symbols and a statistical classifier for recognition. We vectorize a graphic symbol, encode its topological and geometrical information by an attributed relational graph and compute a signature from this structural graph. We have addressed the sensitivity of structural representations to noise by using data-adapted fuzzy intervals. The joint probability distribution of signatures is encoded by a Bayesian network, which serves as a mechanism for pruning irrelevant features and choosing a subset of interesting features from the structural signatures of the underlying symbol set. The Bayesian network is deployed in a supervised learning scenario for recognizing query symbols. The method has been evaluated for robustness against degradations and deformations on pre-segmented 2D linear architectural and electronic symbols from GREC databases, and for its recognition abilities on symbols with context noise, i.e. cropped symbols.
Trapping dust particles in the outer regions of protoplanetary disks
In order to explain grain growth to mm-sized particles and their retention in the outer regions of protoplanetary disks, as observed at sub-mm and mm wavelengths, we investigate whether strong inhomogeneities in the gas density profiles can slow down excessive radial drift and help dust particles to grow. We use coagulation/fragmentation and disk-structure models to simulate the evolution of dust in a bumpy surface density profile, which we mimic with a sinusoidal disturbance. For different values of the amplitude and length scale of the bumps, we investigate the ability of this model to produce and retain large particles on million-year timescales. In addition, we compare the pressure inhomogeneities considered in this work with the pressure profiles that come from magnetorotational instability. Using the Common Astronomy Software Applications ALMA simulator, we study whether there are observational signatures of these pressure inhomogeneities that can be seen with ALMA. We present the favorable conditions to trap dust particles and the corresponding calculations predicting the spectral slope in the mm-wavelength range, to compare with current observations. Finally, we present simulated images using different antenna configurations of ALMA at different frequencies, to show that the ring structures will be detectable at the distances of the Taurus-Auriga or Ophiuchus star-forming regions.
Comment: 15 pages, 14 figures. Accepted for publication in Astronomy and Astrophysics